Identification of Personal Name Aliases on the Web
نویسندگان
چکیده
Extracting aliases of an entity is important for various tasks such as identification of relations among entities, web search and entity disambiguation. To extract relations among entities properly, one must first identify those entities. We propose a novel approach to find aliases of a given name using automatically extracted lexical patterns. We exploit a set of known names and their aliases as training data and extract lexical patterns that convey information related to aliases of names from text snippets returned by a web search engine. The patterns are then used to find candidate aliases of a given name. We use anchor texts to design a word cooccurrence model and use it to define various ranking scores to measure the association between a name and a candidate alias. The ranking scores are integrated with page-countbased association measures using support vector machines to leverage a robust alias detection method. The proposed method outperforms numerous baselines and previous work on alias extraction on a dataset of personal names, achieving a statistically significant mean reciprocal rank of 0.6718. Experiments carried out using a dataset of location names and Japanese personal names suggest the possibility of extending the proposed method to extract aliases for different types of named entities and for other languages. Moreover, the aliases extracted using the proposed method improve recall by 20% in a relation-detection task.
منابع مشابه
Automatically Extracting Personal Name Aliases from the Web
An entity can be referred by multiple name aliases on the web. Extracting aliases of an entity is important for various tasks such as identification of relations among entities, automatic metadata extraction and entity disambiguation. To extract relations among entities properly, one must first identify those entities. Aliases of an entity are useful as metadata for that entity and can be used ...
متن کاملAutomatically Extracting Personal Name Aliases from the Web
Extracting aliases of an entity is important for various tasks such as identification of relations among entities, web search and entity disambiguation. To extract relations among entities properly, one must first identify those entities. We propose a novel approach to find aliases of a given name using automatically extracted lexical patterns. We exploit a set of known names and their aliases ...
متن کاملA Co-occurrence Graph-based Approach for Personal Name Alias Extraction from Anchor Texts
A person may have multiple name aliases on the Web. Identifying aliases of a name is important for various tasks such as information retrieval, sentiment analysis and name disambiguation. We introduce the notion of a word co-occurrence graph to represent the mutual relations between words that appear in anchor texts. Words in anchor texts are represented as nodes in the co-occurrence graph and ...
متن کاملAutomatic Discovery of Lexical Patterns using Pattern Extraction Algorithm to Identify Personal Name Aliases with Entities
The personal name aliases are extremely significant in information retrieval to retrieve complete information about a personal name from the web, as some of the web pages of the person may also be referred by his or her alias name / nick name / real name. There is a rapid growth in people searching where the personal name aliases are concerned. We proposed a pattern generator which includes aut...
متن کاملAutomatic Detection of Name Disambiguation and Extracting Aliases for the Personal Name
An individual can be referred by multiple name aliases on the web. Extracting aliases of a name is important in information retrieval, sentiment analysis and name disambiguation. We propose a novel approach to find aliases of a given name using automatically extracted lexical pattern based approach. We exploit set of known names and their aliases as training data and extract lexical patterns th...
متن کامل